Method and system for storing a file on a plurality of servers

ABSTRACT

The present invention relates to a method for storing a file on a plurality of servers, wherein the number of servers is n and the maximum number of servers which might fail is t, preferably including a predefined number b of byzantine failures and a number t−b of crashes of the servers, and wherein n equals 2t+b+1, comprising the steps of

-   a) Dividing the file into a plurality of chunks, wherein the number of chunks is equal to or greater than the number of servers n,
-   b) Sending n chunks of the file to the n servers, wherein one chunk is sent to each server,
-   c) Determining the number of replies r from the n servers indicating successful storage of the respective chunks,
-   d) Checking if the number of replies r matches a terminating condition, and if not
-   e) Generating a new file based on one or more chunks of the old file, a reconstruction threshold of the old file and the number of replies,
-   f) Performing steps a)-e) with the new file as file for these steps, until the terminating condition in step d) is fulfilled,

wherein the terminating condition is based on the difference between the reconstruction thresholds of the new file of step e) and the old file of step a) and the maximum number of servers which might fail.

The present invention further relates to a system for storing a file on a plurality of servers.

The present invention relates to a method for storing a file on a plurality of servers, wherein the number of servers is n and the maximum number of servers which might fail is t, preferably including a predefined number b of byzantine failures and a number t−b of crashes of the servers, and wherein n equals 2t+b+1.

The present invention further relates to a system for storing a file on a plurality of servers, wherein the number of servers is n and the maximum number of servers which might fail is t, preferably including a predefined number b of byzantine failures and a number t−b of crashes of the servers, and wherein n equals 2t+b+1, comprising the plurality of servers and a writer for writing the file onto the servers, preferably for performing a method according to one of the claims 1-13.

In distributed storage systems, for example a RAID system, a file is dispersed on a plurality of servers, in the case of a RAID system on a plurality of hard discs. The file is dispersed in such a way that, for example in a RAID system, when a hard disc fails or, more generally, is simply unavailable, the number of dispersed file fragments on the remaining hard discs is large enough to restore or reconstruct the dispersed file from the file parts stored on the remaining operating hard discs.

Unavailability of entities like servers in distributed computing systems, or for example hard discs in a RAID system, can be distinguished into byzantine failures and crashes. Byzantine failures are arbitrary faults occurring for example during an execution of an algorithm by the distributed system. When a byzantine failure has occurred the distributed system may respond in an unpredictable way. Byzantine failures may e.g. arise from malware or hackers that attack storage servers or from manufacturer faults.

The other type of failure is a crash leading to unavailability, at least temporarily. A crash may also be an intended shutdown of a server, for example for maintenance reasons.

However, unavailability of entities in distributed systems occurs only occasionally. Such a worst case scenario would include unpredictable message delays, for example due to a network partition or a swamped server. In most cases the distributed system is functioning: the communication is synchronous and messages are delivered within respected time bounds in the distributed system. Further, a distributed computing system is conventionally configured to tolerate a large number of server failures although the occurrence of actual failures is rather low.

Conventional storage protocols like the byzantine storage protocols described in James Hendricks, Gregory R. Ganger, and Michael K. Reiter. 2007, “Low-overhead byzantine fault-tolerant storage”, in Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles (SOSP '07) propose handling of worst case scenarios. One of the disadvantages, however, is that a large communication overhead with respect to the information exchanged is necessary, leading to a high blow-up factor. A further disadvantage is that the methods proposed therein are inflexible, relating only to byzantine failures of servers.

It is therefore an objective of the present invention to provide a method and a system for storing a file on a plurality of servers with reduced communication overhead for dispersing the file onto the plurality of servers.

It is a further objective of the present invention to provide a method and a system for storing a file on a plurality of servers providing an optimized blow-up factor with respect to the amount of information sent.

It is an even further objective of the present invention to provide a method and a system for storing a file on a plurality of servers which are more flexible, in particular with regard to failure types and/or error correction codes.

The aforementioned objectives are accomplished by a method of claim 1.

In claim 1 a method for storing a file on a plurality of servers is defined, wherein the number of servers is n and the maximum number of servers which might fail is t, preferably including a predefined number b of byzantine failures and a number t−b of crashes of the servers, and wherein n equals 2t+b+1.

According to claim 1 the method is characterized by the steps of

-   a) Dividing the file into a plurality of chunks, wherein the number of chunks is equal to or greater than the number of servers n,
-   b) Sending n chunks of the file to the n servers, wherein one chunk is sent to each server,
-   c) Determining the number of replies r from the n servers indicating successful storage of the respective chunks,
-   d) Checking if the number of replies r matches a terminating condition, and if not
-   e) Generating a new file based on one or more chunks of the old file, a reconstruction threshold of the old file and the number of replies,
-   f) Performing steps a)-e) with the new file as file for these steps, until the terminating condition in step d) is fulfilled,
    wherein the terminating condition is based on the difference between the reconstruction thresholds of the new file of step e) and the old file of step a) and the maximum number of servers which might fail.

The aforementioned objectives are also accomplished by a system of claim 14.

In claim 14 a system for storing a file on a plurality of servers is defined, wherein the number of servers is n and the maximum number of servers which might fail is t, preferably including a predefined number b of byzantine failures and a number t−b of crashes of the servers, and wherein n equals 2t+b+1, comprising the plurality of servers and a writer for writing the file onto the servers, preferably for performing a method according to one of the claims 1-13.

According to claim 14 the system is characterized by dividing means, preferably a writer, operable to divide the file into a plurality of chunks, wherein the number of chunks is equal to or greater than the number of servers n, sending means, preferably the writer, operable to send n chunks of the file to the n servers, wherein one chunk is sent to each server, determining means, preferably the writer, operable to determine the number of replies r from the n servers indicating successful storage of the respective chunks, checking means, preferably the writer, operable to check whether the number of replies r matches a terminating condition, generating means, preferably the writer, operable to generate a new file based on one or more chunks of the old file, a reconstruction threshold of the old file and the number of replies, and recursive means operable to operate recursively the dividing means, the sending means, the determining means, the checking means and the generating means with the new file as file, until the terminating condition is fulfilled, wherein the terminating condition is based on the difference between the reconstruction thresholds of the new file and the old file and the maximum number of servers which might fail.

According to the invention it has been recognized that synchrony can be exploited to achieve an optimal communication blow-up factor. For example, the blow-up factor is at most n/(r−t), where r is the number of responsive servers with n−t≦r≦n in a synchronous execution. In an asynchronous execution the lower bound of n/(n−2t) is achieved. A blow-up factor of n/(r−t) is optimal because the file must be recoverable despite t failed servers among the r responsive servers.
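The bound n/(r−t) can be illustrated with a short calculation. The following is a minimal sketch only; the function name `blow_up_factor` and the example values t=2, b=1 are assumptions chosen for illustration and do not appear in the description.

```python
from fractions import Fraction


def blow_up_factor(n: int, r: int, t: int) -> Fraction:
    """Return n/(r - t) for r responsive servers, assuming n - t <= r <= n."""
    if not (n - t <= r <= n):
        raise ValueError("r must satisfy n - t <= r <= n")
    if r - t <= 0:
        raise ValueError("r - t must be positive")
    return Fraction(n, r - t)


# Illustrative values (assumed): t = 2, b = 1, hence n = 2t + b + 1 = 6.
n, t = 6, 2
best = blow_up_factor(n, n, t)       # every server replies: n/(n - t)
worst = blow_up_factor(n, n - t, t)  # only n - t servers reply: n/(n - 2t)
```

With these assumed values the factor ranges from 3/2, when all six servers reply, to 3, when only n−t=4 servers reply.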

According to the invention it has been further recognized that flexibility is enhanced since generic erasure coding schemes may be used, i.e. the invention is not limited or restricted to a specific type of erasure coding technique. Further, crashed servers, byzantine servers and/or a combination of both may be taken into account.

In other words, synchrony is exploited for providing a method and a system for distributed storage with an optimal communication blow-up factor, and erasure coding can be applied recursively on data, parameterized in particular by the number of received replies as a reconstruction threshold.

Further features, advantages and preferred embodiments are described in the following subclaims.

According to a preferred embodiment, in step a), when the total number of generated chunks is greater than the number of servers, the number of chunks generated in addition to the n chunks is dependent on the number of servers which might fail and a reconstruction threshold for the file. This makes it possible to generate, in addition to n so-called main chunks, a number of auxiliary chunks. These auxiliary chunks are then preferably constructed in one step together with the main chunks, allowing the use of verification techniques, for example calculating cross checksums accompanying the chunk in a reply. This makes it possible to verify whether the corresponding chunk on the server was somehow modified.

According to a further preferred embodiment, for the first performing of step a) the reconstruction threshold is based on an estimated number of responsive servers. This makes it possible to provide a number of chunks for the file which is sufficient to reconstruct the file based on the number of responsive servers. For example, the estimation may be based on historic data on the availability of a server or any other data indicating responsiveness, like communication between a writer and the respective servers or the like.

According to a further preferred embodiment the estimated number is greater than a sum of byzantine servers b and servers t−b that might crash. If the number of responsive servers is r, the number of servers that might fail is t and the number of byzantine servers is b, then the number of responsive servers is represented by r≧t+b+1.

According to a further preferred embodiment the total number of chunks in step a) is 2t+th, wherein t is the maximum number of servers that might fail and th is the reconstruction threshold for the corresponding file. One of the advantages is that the total number of chunks may then be different in each round of steps a)-f), so that reconstruction of the overall file, i.e. the original file, is ensured.

According to a further preferred embodiment the one or more chunks for generation of the new file are solely based on one or more of the prior non-sent remaining chunks. This makes it sufficient to count the number of responsive servers, without determining which of the already sent chunks were stored successfully. Thus, efficiency is enhanced.

According to a further preferred embodiment chunks in addition to the generated n chunks are only generated to the extent that the termination condition is not matched. This further improves efficiency, since, when the number of responsive servers is high, only a few chunks are used for the next round, whereas if the number of responsive servers is low, more chunks are generated to ensure consistency for reconstruction. To provide chunks to the extent that the termination condition is not matched, i.e. to create further chunks on demand, rateless codes such as online codes, described in P. Maymounkov, “Online Codes”, Technical Report TR2002-833, Technical report, New York University, 2002, which is incorporated by reference herein, can be used.

According to a further preferred embodiment the generation of the new file is performed by concatenating one or more chunks. This allows a new file to be generated in an easy and efficient way.

According to a further preferred embodiment a timeout threshold is used when determining the number of replies according to step c), preferably wherein the timeout threshold starts after a predetermined number of received replies, preferably wherein the predetermined number corresponds to the number of replies needed for reconstruction of the file. Using a timeout threshold provides a termination condition for waiting for the replies of responsive servers. This allows the next step to be performed even if a certain number of servers has not replied yet. If the timeout threshold preferably starts after a predetermined number of received replies, it is ensured that the writer waits at least until that number of replies is received. Preferably this number corresponds to the number of replies needed for reconstruction of the file. In this case it is ensured that the number of replies of responsive servers is high enough to reconstruct the file. Thus resending of the already sent chunks to all servers is not necessary and the next step may be performed.

According to a further preferred embodiment the timeout threshold is dynamically adapted in each round of steps a)-e), preferably wherein the adaption is based on connection conditions between a writer of the file and the servers and/or on server conditions. By dynamically adapting the timeout threshold, flexibility in general is enhanced. Further, a dynamic adaption of the timeout threshold enables an optimized waiting time for a number of replies of responsive servers in each round. Information concerning connection conditions or server conditions may be used for adapting the timeout threshold: for example, if a server usually has a low latency for responding but during one round the load of the server is increasing, then the timeout may be increased in the next steps to ensure that a predefined number of additional replies is received in any case. Preferably latency may be used for determining a timeout threshold by analyzing historic latency times and then choosing the timeout, for example, such that in 95% of cases the servers reply within that time period.
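One way of deriving such a timeout from historic latencies is sketched below. It is an illustration under assumptions: the nearest-rank percentile rule, the function name `timeout_from_history` and the sample latencies are not taken from the description, which leaves the estimation method open.

```python
def timeout_from_history(latencies_ms, coverage_percent=95):
    """Pick the smallest observed latency that covers the given share of replies.

    Uses the nearest-rank percentile rule (an assumed choice) with integer
    arithmetic so that no floating-point rounding is involved.
    """
    if not latencies_ms:
        raise ValueError("at least one historic latency is required")
    ordered = sorted(latencies_ms)
    # Ceiling division: rank = ceil(coverage_percent * len / 100), at least 1.
    rank = max(1, -(-coverage_percent * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical history: 20 observed reply latencies of 1 ms to 20 ms.
history = list(range(1, 21))
timeout_ms = timeout_from_history(history)
```

For the hypothetical history above, 19 of the 20 observed replies (95%) arrived within the selected timeout.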

According to a further preferred embodiment adaption information is encoded in the replies from the servers. Therefore a writer may analyze the reply and extract the necessary information for adapting the timeout. Additional data traffic after the reply is avoided.

According to a further preferred embodiment the termination condition is fulfilled if either the reconstruction threshold for the file of an actual round with regard to rounds of steps a)-e) is greater than or equal to the reconstruction threshold of the prior round, or the number of rounds has exceeded the number of servers that might fail. This ensures that in any case after t+1 rounds the original file is stored on the servers in such a way that it can be reconstructed from the servers even if t servers have failed. If the reconstruction threshold of the actual round is greater than or equal to the reconstruction threshold of the prior round, then the file is stored on enough servers so that it can be correctly reconstructed by readers later on. Then the write operation is completed.
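The termination condition of this embodiment can be written as a single predicate. This is a minimal sketch; the function and parameter names are hypothetical, and `rounds_done` is assumed to count the rounds of steps a)-e) completed so far.

```python
def terminating_condition(m_next: int, m_current: int, rounds_done: int, t: int) -> bool:
    """Return True when the write may stop.

    m_current   -- reconstruction threshold used in the round just completed
    m_next      -- threshold derived from the replies of that round
    rounds_done -- number of completed rounds (assumed counter)
    t           -- maximum number of servers that might fail
    """
    threshold_not_decreased = m_next >= m_current
    round_limit_reached = rounds_done > t  # i.e. t + 1 rounds have been performed
    return threshold_not_decreased or round_limit_reached
```

For example, with t=2 the write stops after the first round if the threshold did not decrease, and otherwise at the latest after the third round.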

According to a further preferred embodiment the new file is generated based on the first one or more chunks of the prior non-sent remaining chunks. This enables a fast selection of chunks for the new file.

There are several ways to design and further develop the teaching of the present invention in an advantageous way. To this end, reference is made to the patent claims subordinate to patent claim 1 on the one hand and to the following explanation of preferred embodiments of the invention by way of example, illustrated by the figures, on the other hand. In connection with the explanation of the preferred embodiments of the invention with the aid of the figures, generally preferred embodiments and further developments of the teaching will be explained. In the drawings

FIG. 1 shows a first embodiment of a method according to the present invention and

FIG. 2 shows a second embodiment of a method according to the present invention.

FIG. 1 shows a first embodiment of a method according to the present invention.

In FIG. 1 the storing of a file F using rateful codes is shown.

For storing the file F with rateful erasure coding, a writer estimates the number of responsive servers r, which is greater than or equal to t+b+1, assuming that n=2t+b+1 servers are provided, among which at most b might be byzantine and the remaining t−b may crash. Further, a set of clients is assumed, leveraging the servers for sharing data via read/write functionality. Neither servers nor clients communicate with each other. Even further, asynchronous transfer or communication is assumed, in which no assumption is made on the time it takes to transmit a message between a client and a server.

After the number of responsive servers r is estimated, the writer computes a reconstruction threshold m₁ such that m₁=r−t.

Then a sequence of k rounds with the following steps is performed, wherein 1≦k≦t+1.

The writer encodes the file F into n′=n+δ₁=2t+b+1+m₁−(b+1) chunks such that m₁ of the generated chunks are sufficient to reconstruct the file F. In the following, m₁ is the reconstruction threshold in the first round k=1. The total number n′ of generated chunks C₁, C₂ is greater than or equal to the total number of servers n. Therefore the chunks C₁, C₂ can be divided into n main chunks C₁ and n′−n=m₁−(b+1) auxiliary chunks C₂. The blow-up factor BF₁ is then n/m₁.
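The chunk arithmetic of this step can be checked with a few lines. The following sketch only counts chunks and performs no erasure coding; the function name `chunk_counts` and the example values t=2, b=1, r=5 are assumptions for illustration.

```python
def chunk_counts(t: int, b: int, m: int) -> dict:
    """Count main and auxiliary chunks for reconstruction threshold m."""
    n = 2 * t + b + 1              # number of servers
    if m < b + 1:
        raise ValueError("threshold must be at least b + 1")
    total = 2 * t + m              # n' = n + delta = 2t + m
    return {
        "servers": n,
        "total": total,
        "main": n,                 # one main chunk per server
        "auxiliary": total - n,    # equals m - (b + 1)
    }


# Assumed example: t = 2, b = 1, estimated r = 5, hence m_1 = r - t = 3.
counts = chunk_counts(t=2, b=1, m=5 - 2)
```

Here n=6 servers receive one main chunk each and one auxiliary chunk is held back, matching n′−n=m₁−(b+1)=1.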

The writer then selects the first n chunks C₁, i.e. the main chunks C₁, and sends them to the servers S₁, S₂, . . . , S_(n), one to each server S₁, S₂, . . . , S_(n), denoted with reference sign W₁, and W_(k) for round k. Since some of the servers S₁, S₂, S₃, . . . , S_(n) may be unresponsive, the writer counts the number of replies r it receives, wherein the replies are indicated with reference sign R₁, and R_(k) for round k.

In order to avoid blocking by failed servers, the writer waits for the expiry of a predetermined time period, preferably after the writer has already received n−t replies from the servers S₁, S₂, S₃, . . . , S_(n), wherein receiving n−t replies is sufficient for reconstruction of the file F.
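The waiting rule can be sketched on recorded reply arrival times rather than on a live network. The function name `count_replies`, the time unit and the sample arrival times are assumptions for illustration; the sketch also assumes that at least n−t replies eventually arrive.

```python
def count_replies(arrival_times, n: int, t: int, timeout: float) -> int:
    """Count replies accepted by the writer in one round.

    The timeout starts once n - t replies are in; replies arriving after
    the resulting deadline are ignored.
    """
    quorum = n - t
    ordered = sorted(arrival_times)
    if len(ordered) < quorum:
        raise ValueError("fewer than n - t replies: the writer would keep waiting")
    deadline = ordered[quorum - 1] + timeout
    return sum(1 for arrival in ordered if arrival <= deadline)


# Hypothetical arrival times in milliseconds for n = 6 servers, t = 2.
r_1 = count_replies([10, 12, 15, 20, 24, 90], n=6, t=2, timeout=5)
```

In this example the fourth reply arrives at 20 ms, the deadline is 25 ms, and five replies are counted; the slow sixth server does not block the round.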

After receiving the number r₁ of replies in the first round k=1, the reconstruction threshold m₂ for the next round k=2 is set to m₂=r₁−t. When the number of replies received from the servers S₁, S₂, S₃, . . . , S_(n) is at least as large as the estimated number of responsive servers, i.e. m₂≧m₁, the file F is stored at enough servers S₁, S₂, S₃, . . . , S_(n) so that it can be correctly reconstructed by readers later. In this case the write operation is already completed.

If the number of replies r₁ is smaller than the number of estimated replies r from the initial step, then additional chunks need to be stored on the servers S₁, S₂, S₃, . . . , S_(n) for the file F to be recoverable.

These additional chunks are selected by taking the first chunks of the auxiliary chunks C₂₁, i.e. the first m₁−m₂ chunks among the m₁−(b+1) constructed auxiliary chunks C₂. The writer concatenates the first m₁−m₂ auxiliary chunks C₂ and proceeds to round k=2 with the concatenated chunks forming a new file Δ₂ with smaller size than the original file F.

In round k=2, instead of the file F, the generated file Δ₂ consisting of the m₁−m₂ concatenated auxiliary chunks C₂ is encoded into n′=2t+m₂ chunks so that the file Δ₂ can be reconstructed with m₂ chunks, wherein m₂=r₁−t. These resulting chunks of the file Δ₂ are smaller than the auxiliary chunks C₂ of the previous round k=1.

These chunks of smaller size, i.e. the first n chunks of the generated n′ chunks in the second round k=2, are then (re)sent to the n servers S₁, S₂, S₃, . . . , S_(n), one chunk to each server S₁, S₂, S₃, . . . , S_(n). Similarly the writer then counts the number of replies r₂ received. The reconstruction threshold for the next round m₃ is then set to m₃=r₂−t.

After that, it is checked whether m₃ is greater than or equal to m₂, i.e. m₃≧m₂. If this terminating condition TC is fulfilled, then F is finally stored on enough servers S₁, S₂, S₃, . . . , S_(n) so that it can be correctly reconstructed by readers later. If the terminating condition TC is not fulfilled, i.e. m₃<m₂, then a further round k=3 is performed, and so on until either the maximum number of iterations k is reached, i.e. k=t+1, or m_(k)≧m_(k−1). For example, in the worst case m₁ is set to t+b+1 and in the second step m₂ is set to m₁−1, . . . , m_(k)=m₁−(k−1). If k is equal to t+1 then m_(t+1)=m₁−t=b+1. Since n−t servers S₁, S₂, S₃, . . . , S_(n) are responsive, i.e. n−t replies are provided, the file F can be encoded such that it can be recovered from n−2t=b+1 chunks, ensuring its reconstructability.

Therefore in round k the file Δ_(k) is encoded into n′_(k)=2t+m_(k) chunks so it can be reconstructed with m_(k) chunks. n′_(k) is always greater than or equal to n. The first n of the resulting chunks are then resent to the n servers S₁, S₂, S₃, . . . , S_(n), the number of replies r_(k) received is counted, and m_(k+1) is then set to r_(k)−t.

The terminating condition TC is fulfilled if m_(k+1)≧m_(k): F is then stored at enough servers S₁, S₂, S₃, . . . , S_(n) so that it can be correctly reconstructed later. On the other hand, if m_(k+1)<m_(k), then the further round k+1 is performed. The reference sign δ₁, δ₂, δ₃, . . . indicates the number of auxiliary chunks when dividing the file Δ_(k) in the respective round k.
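The round structure described above can be traced with a small simulation that tracks only thresholds and relative data volumes. It is a sketch under stated assumptions: no real encoding or network is involved, the reply counts r_(k) are supplied as input, and the names `simulate_write`, `replies` and `sent` are hypothetical.

```python
from fractions import Fraction


def simulate_write(t: int, b: int, estimated_r: int, replies):
    """Trace thresholds m_1, m_2, ... and the data sent, relative to |F|.

    replies -- reply counts r_1, r_2, ... observed per round (simulated input)
    """
    n = 2 * t + b + 1
    m = estimated_r - t              # m_1 = r - t
    size = Fraction(1)               # size of the current file relative to |F|
    sent = Fraction(0)
    thresholds = [m]
    for k, r_k in enumerate(replies, start=1):
        sent += n * size / m         # n chunks, each of size |file| / m_k
        m_next = r_k - t             # m_{k+1} = r_k - t
        thresholds.append(m_next)
        if m_next >= m or k >= t + 1:
            break                    # terminating condition or round limit
        # New file: the first m_k - m_{k+1} auxiliary chunks, concatenated.
        size = size * (m - m_next) / m
        m = m_next
    return thresholds, sent


# Assumed scenario: t = 2, b = 1 (n = 6), estimate r = 6, but only 5 servers reply.
thresholds, sent = simulate_write(t=2, b=1, estimated_r=6, replies=[5, 5])
```

In this assumed scenario the thresholds are 4, 3, 3, the write ends after the second round, and the data sent amounts to twice the file size, i.e. n divided by the last threshold used.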

Further, in FIG. 1 the corresponding blow-up factors BF_(k) in the corresponding rounds k are shown. For example, in the first round k=1 the blow-up factor BF₁ is equal to n/m₁, and in round k=2 the blow-up factor BF₂ is (n/m₂)(m₁−m₂)/m₁. Therefore the blow-up factor in round k is BF_(k)=(n/m_(k))(m_(k−1)−m_(k))/m_(k−1). The resulting blow-up factor BF can be computed as the sum of the corresponding blow-up factors during each of the k rounds of the write. In detail, it is the sum of the number of bits sent during each round of the write over |F|:

$\begin{aligned}
BF &= BF_{1} + BF_{2} + \ldots \\
&= \frac{n}{m_{1}} + \frac{n}{m_{2}} \cdot \frac{m_{1} - m_{2}}{m_{1}} + \frac{n}{m_{3}} \cdot \frac{m_{2} - m_{3}}{m_{2}} + \ldots + \frac{n}{m_{k}} \cdot \frac{m_{k-1} - m_{k}}{m_{k-1}} \\
&= \frac{n}{m_{k}}
\end{aligned}$
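The final equality follows because each term after the first splits into a difference of two fractions, so that the sum telescopes. A short derivation using only the definitions above (the index j is introduced here for illustration):

```latex
\begin{aligned}
\frac{n}{m_{j}} \cdot \frac{m_{j-1} - m_{j}}{m_{j-1}}
  &= n \, \frac{m_{j-1} - m_{j}}{m_{j} \, m_{j-1}}
   = \frac{n}{m_{j}} - \frac{n}{m_{j-1}}, \qquad j = 2, \ldots, k \\
BF &= \frac{n}{m_{1}} + \sum_{j=2}^{k} \left( \frac{n}{m_{j}} - \frac{n}{m_{j-1}} \right)
    = \frac{n}{m_{k}}
\end{aligned}
```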

FIG. 2 shows a second embodiment of a method according to the present invention.

In FIG. 2 a storing method using rateless erasure codes is shown. The steps performed in FIG. 2 are similar to the steps in FIG. 1. However, instead of encoding the files Δ_(k) with reconstruction threshold m_(k) into n+δ_(k) chunks C₁, C₂ prior to sending the first n chunks to the servers S₁, S₂, S₃, . . . , S_(n), the auxiliary chunks C₂ for generating the file Δ_(k) are generated after receiving the number of replies r from the servers S₁, S₂, S₃, . . . , S_(n), so that the auxiliary chunks C₂ for the file Δ_(k) are only generated to the extent that they are necessary. This is the main difference to the method shown in FIG. 1: in FIG. 1, in round k=1 the writer creates a set of auxiliary chunks C₂ for the case that there are fewer available servers r₁ than the estimated number r. If the estimate was correct, which is the common case, the resources spent to create these additional chunks C₂ are wasted since they are not used for storing.

In FIG. 2 these auxiliary chunks C₂ are created only if necessary: the additional chunks C₂ are generated by performing an encoding procedure with the file Δ_(k−1), the reconstruction threshold m_(k−1) and the number of chunks to be created m_(k−1)−m_(k). These m_(k−1)−m_(k) newly created chunks C₂ then form the file Δ_(k) for the next step. A further difference is that this generated file Δ_(k) is then only encoded into n chunks, corresponding to the number of servers. Therefore, instead of performing an encoding procedure Encode (Δ_(k), m_(k), n+δ_(k)) in each round k, the encoding procedure according to FIG. 2 for producing the chunks to be sent to the servers S₁, S₂, . . . , S_(n) is as follows: Encode (Δ_(k), m_(k), n). The resulting blow-up factor BF corresponds to the blow-up factor of FIG. 1 and amounts to exactly n/m_(k) in total.
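The difference in encoding effort between the two embodiments can be made concrete by counting the chunks produced in one round. This is a counting sketch only, with no actual coding; the function name `chunks_generated` and the example values are assumptions for illustration.

```python
def chunks_generated(t: int, b: int, m_k: int, m_next: int, rateless: bool) -> int:
    """Number of chunks the writer produces in one round.

    rateful  (FIG. 1): all n + delta_k = 2t + m_k chunks are produced up front.
    rateless (FIG. 2): n chunks are produced, plus m_k - m_next further chunks
                       only if the threshold actually dropped.
    """
    n = 2 * t + b + 1
    if rateless:
        return n + max(0, m_k - m_next)
    return 2 * t + m_k


# Assumed example: t = 2, b = 1 (n = 6), threshold m_k = 4.
rateful_count = chunks_generated(2, 1, 4, 4, rateless=False)
on_demand_ok = chunks_generated(2, 1, 4, 4, rateless=True)     # estimate was correct
on_demand_drop = chunks_generated(2, 1, 4, 3, rateless=True)   # one reply fewer
```

When the estimate is correct, the rateless variant produces only the n chunks that are actually sent, whereas the rateful variant has already produced the unused auxiliary chunks.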

In summary, the present invention leverages synchrony in order to construct an asynchronous distributed storage protocol with an optimal communication blow-up factor. The present invention further recursively applies erasure coding on data, parameterized by the number of received replies as a reconstruction threshold.

Further, the present invention minimizes the overhead with respect to information exchanged over the network in existing asynchronous distributed storage protocols. The present invention further applies to generic erasure coding schemes and is not restricted to a specific type of erasure coding technique; in particular, it applies to both crash and byzantine models.

The present invention has inter alia the following advantages: The present invention provides an optimal blow-up factor with respect to the amount of information sent over a communication channel. The present invention may further be used in conjunction with both rateful and rateless codes. Even further, the present invention applies in scenarios featuring crash-only servers, byzantine servers or a combination of crash-only and byzantine servers.

Many modifications and other embodiments of the invention set forth herein will come to the mind of one skilled in the art to which the invention pertains having the benefit of the teachings presented in the foregoing description and the associated drawings. Therefore, it is to be understood that the invention is not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

1. A method for storing a file (F) on a plurality of servers (S₁, S₂, . . . , S_(n)), wherein the number of servers (S₁, S₂, . . . , S_(n)) is n and the maximum number of servers (S₁, S₂, . . . , S_(n)) which might fail is t, preferably including a predefined number b of byzantine failures and a number t−b of crashes of the servers (S₁, S₂, . . . , S_(n)), and wherein n equals 2t+b+1, characterized by the steps of a) Dividing the file (F, Δ₁) into a plurality of chunks (C₁, C₂), wherein the number of chunks (n+δ₁, n) is equal to or greater than the number of servers n, b) Sending n chunks (C₁) of the file (F) to the n servers (S₁, S₂, . . . , S_(n)), wherein one chunk (C₁) is sent to each server (S₁, S₂, . . . , S_(n)), c) Determining the number of replies r from the n servers (S₁, S₂, . . . , S_(n)) indicating successful storage of the respective chunks (C₁), d) Checking if the number of replies r matches a terminating condition (TC), and if not e) Generating a new file (Δ₂) based on one or more chunks (C₁, C₂) of the old file (F), a reconstruction threshold (m₁, m₂, . . . ) of the old file (F) and the number of replies r, f) Performing steps a)-e) with the new file (Δ₂) as file for these steps, until the terminating condition in step d) is fulfilled, wherein the terminating condition (TC) is based on the difference between the reconstruction thresholds of the new file (Δ₁, Δ₂, . . . , Δ_(k)) of step e) and the old file (F) of step a) and the maximum number of servers t which might fail.
2. The method according to claim 1, characterized in that in step a) when the total number of generated chunks (C₁, C₂) is greater than the number of servers n, the number of chunks (δ₁) generated in addition to the n chunks (C₁) is dependent on the number of servers (t) which might fail and a reconstruction threshold (m₁, m₂, . . . ) for the file (Δ₁, Δ₂).
3. The method according to claim 1, characterized in that for the first performing of step a) the reconstruction threshold (m₁) is based on an estimated number of responsive servers (S₁, S₂, . . . , S_(n)).
4. The method according to claim 3, characterized in that the estimated number is greater than a sum of byzantine servers b and servers t−b that might crash.
5. The method according to claim 2, characterized in that the total number of chunks in step a) is 2t+th, wherein t is the maximum number of servers that might fail and th is the reconstruction threshold for the corresponding file (F).
6. The method according to claim 1, characterized in that the one or more chunks (C₁) for generation of the new file (Δ₂) are solely based on one or more of the prior non-sent remaining chunks (C₂).
7. The method according to claim 1, characterized in that chunks (C₂) in addition to the generated n chunks (C₁) are only generated to the extent that the termination condition (TC) is not matched.
8. The method according to claim 1, characterized in that the generation of the new file (Δ₂, Δ₃, . . . ) is performed by concatenating one or more chunks (C₂).
9. The method according to claim 1, characterized in that a timeout threshold is used when determining the number of replies according to step c), preferably wherein the timeout threshold starts after a predetermined number of received replies, preferably wherein the predetermined number corresponds to the number of replies needed for reconstruction of the file (F).
10. The method according to claim 9, characterized in that the timeout threshold is dynamically adapted in each round of steps a)-e), preferably wherein the adaption is based on connection conditions between a writer of the file and the servers (S₁, S₂, . . . , S_(n)) and/or on server conditions.
11. The method according to claim 10, characterized in that adaption information is encoded in the replies from the servers (S₁, S₂, . . . , S_(n)).
12. The method according to claim 1, characterized in that the termination condition (TC) is fulfilled if either the reconstruction threshold (m₂, m₃) for a file (F, Δ₁) of an actual round with regard to rounds of steps a)-e) is greater than or equal to the reconstruction threshold (m₁, m₂, . . . ) of the prior round or the number of rounds has exceeded the number of servers (t) that might fail.
13. The method according to claim 1, characterized in that the new file (Δ₂, Δ₃) is generated based on the first one or more chunks (C₂₁) of the prior non-sent remaining chunks (C₂).
14. A system for storing a file (F) on a plurality of servers (S₁, S₂, . . . , S_(n)), wherein the number of servers (S₁, S₂, . . . , S_(n)) is n and the maximum number of servers (S₁, S₂, . . . , S_(n)) which might fail is t, preferably including a predefined number b of byzantine failures and a number t−b of crashes of the servers (S₁, S₂, . . . , S_(n)), and wherein n equals 2t+b+1, comprising the plurality of servers (S₁, S₂, . . . , S_(n)) and a writer for writing the file onto the servers (S₁, S₂, . . . , S_(n)), preferably for performing a method according to claim 1, characterized by dividing means, preferably a writer, operable to divide the file (F, Δ₁) into a plurality of chunks (C₁, C₂), wherein the number of chunks (n+δ₁, n) is equal to or greater than the number of servers n, sending means, preferably the writer, operable to send n chunks of the file (F) to the n servers (S₁, S₂, . . . , S_(n)), wherein one chunk (C₁, C₂) is sent to each server (S₁, S₂, . . . , S_(n)), determining means, preferably the writer, operable to determine the number of replies r from the n servers (S₁, S₂, . . . , S_(n)) indicating successful storage of the respective chunks (C₁, C₂), checking means, preferably the writer, operable to check if the number of replies r matches a terminating condition (TC), generating means, preferably the writer, operable to generate a new file (Δ₂, Δ₃) based on one or more chunks (C₁, C₂) of the old file (F, Δ₁), a reconstruction threshold (m₁, m₂, . . . ) of the old file and the number of replies r, and recursive means operable to operate recursively the dividing means, the sending means, the determining means, the checking means and the generating means with the new file (Δ₂) as file, until the terminating condition is fulfilled, wherein the terminating condition (TC) is based on the difference between the reconstruction thresholds of the new file (Δ₂, . . . , Δ_(k)) and the old file (F) and the maximum number of servers t which might fail.